89 research outputs found

    Topic and language specific internet search engine

    In this paper we present the results of a project that aims to build a categorization-based, topic-oriented Internet search engine. In particular, we focus on economics-related electronic materials available on the Internet in Hungarian. We present our search service, which harvests, stores, and makes searchable the publicly available content of the subject domain. The paper describes the search facilities and the structure of the implemented system, with special emphasis on intelligent search algorithms and document processing methods

    Recommender systems challenge 2014

    Benchmarking: A methodology for ensuring the relative quality of recommendation systems in software engineering

    This chapter describes the concepts involved in the process of benchmarking of recommendation systems. Benchmarking of recommendation systems is used to ensure the quality of a research system or production system in comparison to other systems, whether algorithmically, infrastructurally, or according to any sought-after quality. Specifically, the chapter presents evaluation of recommendation systems according to recommendation accuracy, technical constraints, and business values in the context of a multi-dimensional benchmarking and evaluation model encompassing any number of qualities into a final comparable metric. The focus is put on quality measures related to recommendation accuracy, technical factors, and business values. The chapter first introduces concepts related to evaluation and benchmarking of recommendation systems, continues with an overview of the current state of the art, then presents the multi-dimensional approach in detail. The chapter concludes with a brief discussion of the introduced concepts and a summary

    Simple tricks for improving pattern-based information extraction from the biomedical literature

    Background: Pattern-based approaches to relation extraction have shown very good results in many areas of biomedical text mining. However, defining the right set of patterns is difficult; approaches are either manual, incurring high cost, or automatic, often resulting in large sets of noisy patterns. Results: We propose several techniques for filtering sets of automatically generated patterns and analyze their effectiveness for different extraction tasks, as defined in the recent BioNLP 2009 shared task. We focus on simple methods that only take into account the complexity of the pattern and the complexity of the texts the patterns are applied to. We show that our techniques, despite their simplicity, yield large improvements in all tasks we analyzed. For instance, they raise the F-score for the task of extracting gene expression events from 24.8% to 51.9%. Conclusions: Even very simple filtering techniques can significantly improve the F-score of an information extraction method based on automatically generated patterns. Furthermore, the application of such methods yields a considerable speed-up, as fewer matches need to be analysed. Due to their simplicity, the proposed filtering techniques should also be applicable to other methods using linguistic patterns for information extraction.

    Workshop on reproducibility and replication in recommender systems evaluation - RepSys

    This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in RecSys '13 Proceedings of the 7th ACM conference on Recommender systems, http://dx.doi.org/10.1145/2507157.2508006. Experiment replication and reproduction are key requirements for empirical research methodology, and an important open issue in the field of Recommender Systems. When an experiment is repeated by a different researcher and exactly the same result is obtained, we can say the experiment has been replicated. When the results are not exactly the same but the conclusions are compatible with the prior ones, we have a reproduction of the experiment. Reproducibility and replication involve recommendation algorithm implementations, experimental protocols, and evaluation metrics. While the problem of reproducibility and replication has been recognized in the Recommender Systems community, the need for a clear solution remains largely unmet, which motivates the present workshop. This workshop was carried out during the tenure of an ERCIM “Alain Bensoussan” Fellowship Programme, funded by European Commission FP7 grant agreement no. 246016

    Algorithmic translation of syntactically parsed possessive expressions into a given formal language

    Numerous works in the international literature [5; 7; 10; 17; 20] have dealt with the semantic modelling of possessive constructions and with describing their semantic properties. However, the models created so far do not provide for the automated generation of a formal sentence that corresponds exactly to a particular possessive construction. In this paper we show how the problem can be solved in a general form, where the limits of algorithm-supported processing lie, and which tasks remain to be solved

    User-Item Reciprocity in Recommender Systems: Incentivizing the Crowd

    Data consumption has changed significantly in the last 10 years. The digital revolution and the Internet have brought an abundance of information to users. Recommender systems are a popular means of finding content that is both relevant and personalized. However, today’s users require better recommender systems, capable of producing continuous data feeds that keep up with their instantaneous and mobile needs. The CrowdRec project addresses this demand by providing context-aware, resource-combining, socially-informed, interactive and scalable recommendations. The key insight of CrowdRec is that, in order to achieve the dense, high-quality, timely information required for such systems, it is necessary to move from passive user data collection to more active techniques fostering user engagement. For this purpose, CrowdRec activates the crowd, soliciting input and feedback from the wider community

    Contextualized named entity recognition and relation extraction in hospital discharge summaries

    In this paper we describe our solution to the 2010 information extraction challenge (Fourth i2b2/VA Shared-Task) organized by i2b2, an organization concerned with text mining of hospital discharge summaries. In the first task, named entity recognition, the occurrences of three entity types in the text had to be marked, more precisely a narrow enclosing grammatical unit for each. In the second task, assertion classification, the nature of each mention of these entities (affirmative, negated, speculative, etc.) had to be classified. Finally, in the third task, relation extraction, it had to be determined whether a relation holds between entities appearing in the same sentence and, if so, what its type is. Our solutions use context-based methods, partly rule-based and partly based on supervised machine learning, trained on the training data made available to us. We analyze the effectiveness of the individual methods and examine some possible directions for further development